Minimum cost based phoneme class detection for improved iterative speech enhancement
Abstract
It is known that degrading acoustic noise influences speech quality across phoneme classes in a non-uniform manner. This results in variable quality performance for many speech enhancement algorithms in noisy environments. To address this, a hidden-Markov-model phoneme classification procedure is proposed which directs single-channel speech enhancement across individual phoneme classes. The procedure performs broad phoneme class partitioning of noisy speech frames using a continuous-mixture hidden-Markov-model recognizer in conjunction with a cost-based decision process. Cost functions are assigned which weigh errors between phoneme classes that are perceptually different (e.g., vowels versus fricatives). Once noisy speech frames are partitioned, iterative speech enhancement based on all-pole parameter estimation with inter- and intra-frame spectral constraints (Auto:I,LSP:T) is employed. The phoneme class directed enhancement algorithm is evaluated using TIMIT speech data, and is shown to yield substantial improvement in objective speech quality over a range of signal-to-noise ratios and individual phoneme classes. The algorithm is also shown to deliver consistent quality improvement in a speaker-independent scenario.
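The cost-based decision step described in the abstract can be illustrated with a minimum expected cost (Bayes risk) rule. The following is a minimal sketch under stated assumptions, not the paper's implementation: the class list, the cost values, and the function names `posteriors` and `min_cost_class` are hypothetical, and the per-frame class log-likelihoods are assumed to come from some recognizer that is not modeled here.

```python
import math

# Hypothetical broad phoneme classes (the paper's actual class set is not listed here).
CLASSES = ["vowel", "fricative", "silence"]

# Hypothetical cost matrix: COST[i][j] is the cost of deciding class j when the
# true class is i. Confusions between perceptually different classes cost more.
COST = [
    [0.0, 2.0, 3.0],
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
]


def posteriors(log_likelihoods, log_priors=None):
    """Turn per-class log-likelihoods into normalized posterior probabilities."""
    if log_priors is None:
        log_priors = [0.0] * len(log_likelihoods)  # assumption: uniform priors
    scores = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def min_cost_class(log_likelihoods, cost=COST, log_priors=None):
    """Return (index of the class with minimum expected cost, expected costs)."""
    post = posteriors(log_likelihoods, log_priors)
    n = len(post)
    # expected[j] = sum over true classes i of P(i | frame) * cost[i][j]
    expected = [sum(post[i] * cost[i][j] for i in range(n)) for j in range(n)]
    best = min(range(n), key=lambda j: expected[j])
    return best, expected
```

With these hypothetical costs, a frame whose posteriors are 0.4 (vowel), 0.3 (fricative), and 0.3 (silence) is assigned to the fricative class, because its expected cost (1.1) is lower than that of the most likely class, vowel (1.5). This shows how a cost-based rule can differ from a plain maximum-likelihood decision.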
Similar resources
Markov Model Based Phoneme Class Partitioning for Improved Constrained Iterative Speech
Research has shown that degrading acoustic background noise influences speech quality across phoneme classes in a non-uniform manner. This results in variable quality performance of many speech enhancement algorithms in noisy environments. A phoneme classification procedure is proposed which directs single-channel constrained speech enhancement. The procedure performs broad phoneme class partitio...
Class constrained ROVER based speech enhancement
A phoneme class based speech enhancement algorithm is proposed that is derived from the family of constrained iterative enhancement schemes. The algorithm is a ROVER based solution that overcomes three limitations of the iterative scheme. It removes the dependency on the terminating iteration, employs direct phoneme class constraints, and achieves suppression of audible noise. In the ROVER sche...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...
Text-Directed Speech Enhancement Employing
There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech...
Publication date: 1994